ML Topics
1. Learning Model (Chapters 2 and 3)
- Formal model with definitions
- Statements of the propositions
- Proof of Corollary 2.3
- Definition of PAC learnability with 0-1 loss
- Sample complexity for PAC learnability of finite hypothesis classes with 0-1 loss
- Definition of agnostic PAC learnability for general loss functions
- Definitions of empirical error and true error for general loss functions
2. Uniform Convergence (Chapter 4)
- Definition of ε-representative sample (all details)
- Lemma 4.2: statement and proof with all details
- Definition of uniform convergence property: main idea (details of definition not required)
- Corollary 4.4: main idea (details of statement not required)
- Corollary 4.6: statement and proof with all details
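For quick reference, the two finite-class sample complexity bounds named above (Corollary 2.3 and Corollary 4.6) take the following standard form in Shalev-Shwartz and Ben-David; the exact constants on the slides may differ slightly:

```latex
% Realizable case (Corollary 2.3): for a finite class H with the 0-1 loss,
% ERM is a PAC learner with sample complexity
m_{\mathcal{H}}(\epsilon,\delta) \le \left\lceil \frac{\log(|\mathcal{H}|/\delta)}{\epsilon} \right\rceil

% Agnostic case (Corollary 4.6): a finite H has the uniform convergence
% property, hence is agnostically PAC learnable with sample complexity
m_{\mathcal{H}}(\epsilon,\delta) \le \left\lceil \frac{2\log(2|\mathcal{H}|/\delta)}{\epsilon^{2}} \right\rceil
```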
3. Bias-Complexity Tradeoff (Chapter 5)
- No Free Lunch (NFL) theorem, NFL and prior knowledge: only main idea
- Corollary 5.2: statement with all details (no proof)
- Approximation error, estimation error, complexity, and error decomposition: all details (see the decomposition recalled after this list)
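As a reminder, the error decomposition referenced above splits the risk of the ERM output into the two terms traded off by the choice of hypothesis class:

```latex
L_{\mathcal{D}}(h_S)
 \;=\; \underbrace{\min_{h \in \mathcal{H}} L_{\mathcal{D}}(h)}_{\text{approximation error } \epsilon_{\mathrm{app}}}
 \;+\; \underbrace{\Bigl( L_{\mathcal{D}}(h_S) - \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h) \Bigr)}_{\text{estimation error } \epsilon_{\mathrm{est}}}
```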
4. VC-dimension (Chapter 6)
- Restrictions, shattering, VC-dimension: definitions in detail (see the small shattering checker after this list)
- Fundamental Theorems of Statistical Learning and bound on generalization: in detail, as on the slides (no proof)
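To make the shattering definition concrete, here is a minimal brute-force check that a point set is shattered by 1-D threshold classifiers; this is an illustrative sketch, not course material, and names like `is_shattered` are my own:

```python
from itertools import product

def threshold_labelings(points, thresholds):
    """All labelings of `points` induced by classifiers x -> 1[x >= t]."""
    return {tuple(int(x >= t) for x in points) for t in thresholds}

def is_shattered(points, thresholds):
    """True iff every one of the 2^n labelings of `points` is realized."""
    realized = threshold_labelings(points, thresholds)
    return all(lab in realized for lab in product([0, 1], repeat=len(points)))

# Thresholds realize at most n+1 of the 2^n labelings of n points, so VCdim = 1:
candidate_ts = [-1.0, 0.5, 2.0]                # enough thresholds for the points below
print(is_shattered([0.0], candidate_ts))       # True: a single point is shattered
print(is_shattered([0.0, 1.0], candidate_ts))  # False: labeling (1, 0) is unrealizable
```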
5. Linear Models (Chapter 9)
- Linear predictors/models: definitions with all details
- Linear regression: definitions, matrix form, derivation of the best predictor, use of the generalized inverse in detail (derivation of the generalized inverse not required); see the sketch after this list
- R²: definition and interpretation in detail
- Linear classification: perceptron: definitions and algorithm in detail
- Proposition on perceptron convergence: only main idea (as in slide “Perceptron: Notes”)
- VC-dimension of linear models: in detail
- Logistic regression: all details
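A minimal numpy sketch of the closed-form least-squares predictor via the generalized (Moore-Penrose) inverse, together with R². This only illustrates the formulas above; the appended bias column is one common convention, not necessarily the one used on the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # design matrix (one row per sample)
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

Xb = np.hstack([X, np.ones((100, 1))])         # append a column of 1s for the bias term
w = np.linalg.pinv(Xb) @ y                     # w = X^+ y, the least-squares solution

y_hat = Xb @ w
ss_res = np.sum((y - y_hat) ** 2)              # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)           # total sum of squares
r2 = 1 - ss_res / ss_tot                       # R^2: fraction of variance explained
print(w, r2)
```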
6. Model Selection and Validation (Chapter 11)
- Validation: all details, including the statement of the bound on the generalization error using a validation set (Theorem 11.1, no proof)
- Validation for model selection: all details, including the statement of the bound on the generalization error using a validation set (Theorem 11.2, no proof)
- Model-selection curve, train-validation-test split, k-fold cross-validation: all details (see the sketch after this list)
- What if learning fails: main idea
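A short sketch of k-fold cross-validation for model selection, in plain numpy; the `fit` and `error` helpers are hypothetical placeholders standing in for any learner and loss:

```python
import numpy as np

def k_fold_cv_error(X, y, fit, error, k=5, seed=0):
    """Average validation error over k folds; `fit(X, y)` returns a predictor h,
    `error(h, X, y)` returns its empirical error on (X, y)."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        h = fit(X[train], y[train])
        errs.append(error(h, X[val], y[val]))
    return float(np.mean(errs))

# Model selection: pick the hyperparameter value with the lowest CV error,
# then retrain on all training data and report error on a held-out test set.
```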
7. Regularization and Feature Selection (Chapters 13 and 25)
- Regularized Loss Minimization: all details
- ℓ1 regularization, LASSO: all details
- Tikhonov regularization, ridge regression, derivation of the optimal solution for ridge regression: all details (see the closed form after this list)
- Subset selection, forward selection, backward selection, without and with validation data: all details
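For the ridge derivation above, setting the gradient of the regularized objective to zero gives a closed-form solution. A minimal numpy sketch follows; note that conventions vary (some texts absorb the sample size m into λ), so check the slides for the exact scaling:

```python
import numpy as np

def ridge(X, y, lam):
    """Solve min_w  lam * ||w||^2 + (1/m) * ||Xw - y||^2.
    The gradient condition 2*lam*w + (2/m) * X^T (Xw - y) = 0 rearranges to
    (lam*m*I + X^T X) w = X^T y, a linear system we solve directly."""
    m, d = X.shape
    A = lam * m * np.eye(d) + X.T @ X
    return np.linalg.solve(A, X.T @ y)
```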
8. GD, SGD, and SVM (Chapters 14 and 15)
- Gradient descent (GD): all details
- Stochastic gradient descent (SGD): all details
- Hard-SVM optimization problem: all details
- Equivalent formulation and quadratic formulation of hard-SVM: main idea
- Definition of support vectors for hard-SVM: all details
- Soft-SVM optimization problem, hinge loss: all details
- SGD for solving soft-SVM (algorithm): all details (see the sketch after this list)
- Hard-SVM dual formulation: main idea
- SVM for regression: all details for the optimization problem and the definition of support vectors only
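A minimal sketch of SGD on the soft-SVM objective, in the Pegasos style with step size η_t = 1/(λt). This is illustrative: the algorithm presented in class may differ in details such as iterate averaging or projection:

```python
import numpy as np

def sgd_soft_svm(X, y, lam=0.1, T=1000, seed=0):
    """SGD on f(w) = (lam/2)*||w||^2 + (1/m) * sum_i max(0, 1 - y_i <w, x_i>).
    Labels y_i must be +1/-1. Returns the average of the iterates."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    w, w_sum = np.zeros(d), np.zeros(d)
    for t in range(1, T + 1):
        i = rng.integers(m)                    # sample one example uniformly
        eta = 1.0 / (lam * t)                  # decreasing step size
        if y[i] * (w @ X[i]) < 1:              # hinge loss active: its subgradient kicks in
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:                                  # only the regularizer contributes
            w = (1 - eta * lam) * w
        w_sum += w
    return w_sum / T                           # averaged iterate (the SGD output)
```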
9. Kernels and SVM (Chapter 16)
- Definition of kernel: all details
- Kernel trick for SVM: all details
- Commonly used kernels for SVM: all details
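The commonly used kernels above, written as small numpy functions (a sketch; the hyperparameter names `degree`, `c`, and `sigma` are mine):

```python
import numpy as np

def linear_kernel(X, Z):
    return X @ Z.T                                # K(x, z) = <x, z>

def polynomial_kernel(X, Z, degree=3, c=1.0):
    return (X @ Z.T + c) ** degree                # K(x, z) = (<x, z> + c)^d

def gaussian_kernel(X, Z, sigma=1.0):
    # K(x, z) = exp(-||x - z||^2 / (2 sigma^2)), the RBF kernel
    sq = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

# Kernel trick: an algorithm that only touches inner products can be run on
# the Gram matrix K[i, j] = K(x_i, x_j) instead of an explicit feature map.
```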
10. Neural Networks and Deep Learning (Chapter 20)
- Neuron, activation function, network architecture, point of view of one node, hypothesis set, matrix notation: all details
- General construction of NN for a given Boolean formula: all details
- Expressiveness of NNs: main idea
- Sample complexity, runtime of learning NNs: main idea
- Forward propagation algorithm: all details (see the sketch after this list)
- SGD and Backpropagation algorithm: main idea (pseudocode: only main structure)
- Regularized NNs: main idea
- CNNs: differences from standard NNs, convolution properties, convolutional layer, pooling layer, dropout, early stopping, data augmentation, cross-entropy loss function: main idea
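A minimal forward-propagation sketch in matrix notation, assuming hypothetical layer sizes and a sigmoid activation as one common choice:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Forward propagation: a^(0) = x, then for each layer t
    z^(t) = W^(t) a^(t-1) + b^(t)  and  a^(t) = sigma(z^(t))."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

rng = np.random.default_rng(0)
sizes = [4, 5, 3, 1]                 # input dim 4, two hidden layers, scalar output
Ws = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(m) for m in sizes[1:]]
print(forward(rng.normal(size=4), Ws, bs))
```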
11. Clustering (Chapter 22)
- Introduction to unsupervised learning: all details
- Clustering definition and difficulties: main idea
- Model for clustering: all details
- k-means clustering and Lloyd's algorithm: all details (see the sketch after this list)
- Complexity of Lloyd’s algorithm: all details
- k-means++: all details
- Linkage-based clustering: all details
- Silhouette score: all details
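A compact numpy sketch of Lloyd's algorithm for k-means. Random initialization is used here for brevity; k-means++ would instead seed the centers with distance-weighted sampling. The function name is mine:

```python
import numpy as np

def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate between assigning each point to its
    nearest center and recomputing each center as the mean of its cluster."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # random initialization
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)                 # assignment step
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):      # converged: assignments are stable
            break
        centers = new_centers                      # update step
    return centers, labels
```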